{
  "nbformat": 4,
  "nbformat_minor": 0,
  "metadata": {
    "colab": {
      "name": "intro_to_pandas.ipynb",
      "version": "0.3.2",
      "views": {},
      "default_view": {},
      "provenance": [
        {
          "file_id": "/v2/external/notebooks/mlcc/intro_to_pandas.ipynb",
          "timestamp": 1528022507361
        }
      ],
      "collapsed_sections": []
    }
  },
  "cells": [
    {
      "metadata": {
        "id": "JndnmDMp66FL",
        "colab_type": "text"
      },
      "cell_type": "markdown",
      "source": [
        "#### Copyright 2017 Google LLC."
      ]
    },
    {
      "metadata": {
        "id": "hMqWDc_m6rUC",
        "colab_type": "code",
        "colab": {
          "autoexec": {
            "startup": false,
            "wait_interval": 0
          }
        },
        "cellView": "both"
      },
      "cell_type": "code",
      "source": [
        "# Licensed under the Apache License, Version 2.0 (the \"License\");\n",
        "# you may not use this file except in compliance with the License.\n",
        "# You may obtain a copy of the License at\n",
        "#\n",
        "# https://www.apache.org/licenses/LICENSE-2.0\n",
        "#\n",
        "# Unless required by applicable law or agreed to in writing, software\n",
        "# distributed under the License is distributed on an \"AS IS\" BASIS,\n",
        "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n",
        "# See the License for the specific language governing permissions and\n",
        "# limitations under the License."
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "metadata": {
        "id": "rHLcriKWLRe4",
        "colab_type": "text"
      },
      "cell_type": "markdown",
      "source": [
        " # Pandas 简介"
      ]
    },
    {
      "metadata": {
        "id": "QvJBqX8_Bctk",
        "colab_type": "text"
      },
      "cell_type": "markdown",
      "source": [
        "**学习目标：**\n",
        "  * 大致了解 *pandas* 库的 `DataFrame` 和 `Series` 数据结构\n",
        "  * 存取和处理 `DataFrame` 和 `Series` 中的数据\n",
        "  * 将 CSV 数据导入 pandas 库的 `DataFrame`\n",
        "  * 对 `DataFrame` 重建索引来随机打乱数据"
      ]
    },
    {
      "metadata": {
        "id": "TIFJ83ZTBctl",
        "colab_type": "text"
      },
      "cell_type": "markdown",
      "source": [
        " [*pandas*](http://pandas.pydata.org/) 是一种列存数据分析 API。它是用于处理和分析输入数据的强大工具，很多机器学习框架都支持将 *pandas* 数据结构作为输入。\n",
        "虽然全方位介绍 *pandas* API 会占据很长篇幅，但它的核心概念非常简单，我们会在下文中进行说明。有关更完整的参考，请访问 [*pandas* 文档网站](http://pandas.pydata.org/pandas-docs/stable/index.html)，其中包含丰富的文档和教程资源。"
      ]
    },
    {
      "metadata": {
        "id": "s_JOISVgmn9v",
        "colab_type": "text"
      },
      "cell_type": "markdown",
      "source": [
        " ## 基本概念\n",
        "\n",
        "以下行导入了 *pandas* API 并输出了相应的 API 版本："
      ]
    },
    {
      "metadata": {
        "id": "aSRYu62xUi3g",
        "colab_type": "code",
        "colab": {
          "autoexec": {
            "startup": false,
            "wait_interval": 0
          },
          "base_uri": "https://localhost:8080/",
          "height": 34
        },
        "outputId": "f9b2b479-12f9-4392-e49a-36d2d12ab7de",
        "executionInfo": {
          "status": "ok",
          "timestamp": 1528027858792,
          "user_tz": -480,
          "elapsed": 1550,
          "user": {
            "displayName": "陈方杰",
            "photoUrl": "https://lh3.googleusercontent.com/a/default-user=s128",
            "userId": "115418685205938795893"
          }
        }
      },
      "cell_type": "code",
      "source": [
        "import pandas as pd\n",
        "pd.__version__"
      ],
      "execution_count": 16,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "u'0.22.0'"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 16
        }
      ]
    },
    {
      "metadata": {
        "id": "daQreKXIUslr",
        "colab_type": "text"
      },
      "cell_type": "markdown",
      "source": [
        " *pandas* 中的主要数据结构被实现为以下两类：\n",
        "\n",
        "  * **`DataFrame`**，您可以将它想象成一个关系型数据表格，其中包含多个行和已命名的列。\n",
        "  * **`Series`**，它是单一列。`DataFrame` 中包含一个或多个 `Series`，每个 `Series` 均有一个名称。\n",
        "\n",
        "数据框架是用于数据操控的一种常用抽象实现形式。[Spark](https://spark.apache.org/) 和 [R](https://www.r-project.org/about.html) 中也有类似的实现。"
      ]
    },
    {
      "metadata": {
        "id": "fjnAk1xcU0yc",
        "colab_type": "text"
      },
      "cell_type": "markdown",
      "source": [
        " 创建 `Series` 的一种方法是构建 `Series` 对象。例如："
      ]
    },
    {
      "metadata": {
        "id": "DFZ42Uq7UFDj",
        "colab_type": "code",
        "colab": {
          "autoexec": {
            "startup": false,
            "wait_interval": 0
          },
          "base_uri": "https://localhost:8080/",
          "height": 84
        },
        "outputId": "d89e5330-3a2d-4e77-f0d9-bb9a895b863a",
        "executionInfo": {
          "status": "ok",
          "timestamp": 1528027897144,
          "user_tz": -480,
          "elapsed": 1671,
          "user": {
            "displayName": "陈方杰",
            "photoUrl": "https://lh3.googleusercontent.com/a/default-user=s128",
            "userId": "115418685205938795893"
          }
        }
      },
      "cell_type": "code",
      "source": [
        "pd.Series(['San Francisco', 'San Jose', 'Sacramento'])"
      ],
      "execution_count": 17,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "0    San Francisco\n",
              "1         San Jose\n",
              "2       Sacramento\n",
              "dtype: object"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 17
        }
      ]
    },
    {
      "metadata": {
        "id": "U5ouUp1cU6pC",
        "colab_type": "text"
      },
      "cell_type": "markdown",
      "source": [
        " 您可以将映射 `string` 列名称的 `dict` 传递到它们各自的 `Series`，从而创建`DataFrame`对象。如果 `Series` 在长度上不一致，系统会用特殊的 [NA/NaN](http://pandas.pydata.org/pandas-docs/stable/missing_data.html) 值填充缺失的值。例如："
      ]
    },
    {
      "metadata": {
        "id": "avgr6GfiUh8t",
        "colab_type": "code",
        "colab": {
          "autoexec": {
            "startup": false,
            "wait_interval": 0
          },
          "base_uri": "https://localhost:8080/",
          "height": 136
        },
        "outputId": "e4d1cce6-7042-426b-fc93-9943dd83e8bd",
        "executionInfo": {
          "status": "ok",
          "timestamp": 1528027901578,
          "user_tz": -480,
          "elapsed": 1361,
          "user": {
            "displayName": "陈方杰",
            "photoUrl": "https://lh3.googleusercontent.com/a/default-user=s128",
            "userId": "115418685205938795893"
          }
        }
      },
      "cell_type": "code",
      "source": [
        "city_names = pd.Series(['San Francisco', 'San Jose', 'Sacramento'])\n",
        "population = pd.Series([852469, 1015785, 485199])\n",
        "\n",
        "pd.DataFrame({ 'City name': city_names, 'Population': population })"
      ],
      "execution_count": 18,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/html": [
              "<div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>City name</th>\n",
              "      <th>Population</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>0</th>\n",
              "      <td>San Francisco</td>\n",
              "      <td>852469</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>1</th>\n",
              "      <td>San Jose</td>\n",
              "      <td>1015785</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>2</th>\n",
              "      <td>Sacramento</td>\n",
              "      <td>485199</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>"
            ],
            "text/plain": [
              "       City name  Population\n",
              "0  San Francisco      852469\n",
              "1       San Jose     1015785\n",
              "2     Sacramento      485199"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 18
        }
      ]
    },
    {
      "metadata": {
        "id": "oa5wfZT7VHJl",
        "colab_type": "text"
      },
      "cell_type": "markdown",
      "source": [
        " 但是在大多数情况下，您需要将整个文件加载到 `DataFrame` 中。下面的示例加载了一个包含加利福尼亚州住房数据的文件。请运行以下单元格以加载数据，并创建特征定义："
      ]
    },
    {
      "metadata": {
        "id": "av6RYOraVG1V",
        "colab_type": "code",
        "colab": {
          "autoexec": {
            "startup": false,
            "wait_interval": 0
          },
          "base_uri": "https://localhost:8080/",
          "height": 284
        },
        "outputId": "836dcf5a-539c-40ec-a821-93429e7932df",
        "executionInfo": {
          "status": "ok",
          "timestamp": 1528027923363,
          "user_tz": -480,
          "elapsed": 1415,
          "user": {
            "displayName": "陈方杰",
            "photoUrl": "https://lh3.googleusercontent.com/a/default-user=s128",
            "userId": "115418685205938795893"
          }
        }
      },
      "cell_type": "code",
      "source": [
        "california_housing_dataframe = pd.read_csv(\"https://storage.googleapis.com/mledu-datasets/california_housing_train.csv\", sep=\",\")\n",
        "california_housing_dataframe.describe()"
      ],
      "execution_count": 19,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/html": [
              "<div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>longitude</th>\n",
              "      <th>latitude</th>\n",
              "      <th>housing_median_age</th>\n",
              "      <th>total_rooms</th>\n",
              "      <th>total_bedrooms</th>\n",
              "      <th>population</th>\n",
              "      <th>households</th>\n",
              "      <th>median_income</th>\n",
              "      <th>median_house_value</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>count</th>\n",
              "      <td>17000.000000</td>\n",
              "      <td>17000.000000</td>\n",
              "      <td>17000.000000</td>\n",
              "      <td>17000.000000</td>\n",
              "      <td>17000.000000</td>\n",
              "      <td>17000.000000</td>\n",
              "      <td>17000.000000</td>\n",
              "      <td>17000.000000</td>\n",
              "      <td>17000.000000</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>mean</th>\n",
              "      <td>-119.562108</td>\n",
              "      <td>35.625225</td>\n",
              "      <td>28.589353</td>\n",
              "      <td>2643.664412</td>\n",
              "      <td>539.410824</td>\n",
              "      <td>1429.573941</td>\n",
              "      <td>501.221941</td>\n",
              "      <td>3.883578</td>\n",
              "      <td>207300.912353</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>std</th>\n",
              "      <td>2.005166</td>\n",
              "      <td>2.137340</td>\n",
              "      <td>12.586937</td>\n",
              "      <td>2179.947071</td>\n",
              "      <td>421.499452</td>\n",
              "      <td>1147.852959</td>\n",
              "      <td>384.520841</td>\n",
              "      <td>1.908157</td>\n",
              "      <td>115983.764387</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>min</th>\n",
              "      <td>-124.350000</td>\n",
              "      <td>32.540000</td>\n",
              "      <td>1.000000</td>\n",
              "      <td>2.000000</td>\n",
              "      <td>1.000000</td>\n",
              "      <td>3.000000</td>\n",
              "      <td>1.000000</td>\n",
              "      <td>0.499900</td>\n",
              "      <td>14999.000000</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>25%</th>\n",
              "      <td>-121.790000</td>\n",
              "      <td>33.930000</td>\n",
              "      <td>18.000000</td>\n",
              "      <td>1462.000000</td>\n",
              "      <td>297.000000</td>\n",
              "      <td>790.000000</td>\n",
              "      <td>282.000000</td>\n",
              "      <td>2.566375</td>\n",
              "      <td>119400.000000</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>50%</th>\n",
              "      <td>-118.490000</td>\n",
              "      <td>34.250000</td>\n",
              "      <td>29.000000</td>\n",
              "      <td>2127.000000</td>\n",
              "      <td>434.000000</td>\n",
              "      <td>1167.000000</td>\n",
              "      <td>409.000000</td>\n",
              "      <td>3.544600</td>\n",
              "      <td>180400.000000</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>75%</th>\n",
              "      <td>-118.000000</td>\n",
              "      <td>37.720000</td>\n",
              "      <td>37.000000</td>\n",
              "      <td>3151.250000</td>\n",
              "      <td>648.250000</td>\n",
              "      <td>1721.000000</td>\n",
              "      <td>605.250000</td>\n",
              "      <td>4.767000</td>\n",
              "      <td>265000.000000</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>max</th>\n",
              "      <td>-114.310000</td>\n",
              "      <td>41.950000</td>\n",
              "      <td>52.000000</td>\n",
              "      <td>37937.000000</td>\n",
              "      <td>6445.000000</td>\n",
              "      <td>35682.000000</td>\n",
              "      <td>6082.000000</td>\n",
              "      <td>15.000100</td>\n",
              "      <td>500001.000000</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>"
            ],
            "text/plain": [
              "          longitude      latitude  housing_median_age   total_rooms  \\\n",
              "count  17000.000000  17000.000000        17000.000000  17000.000000   \n",
              "mean    -119.562108     35.625225           28.589353   2643.664412   \n",
              "std        2.005166      2.137340           12.586937   2179.947071   \n",
              "min     -124.350000     32.540000            1.000000      2.000000   \n",
              "25%     -121.790000     33.930000           18.000000   1462.000000   \n",
              "50%     -118.490000     34.250000           29.000000   2127.000000   \n",
              "75%     -118.000000     37.720000           37.000000   3151.250000   \n",
              "max     -114.310000     41.950000           52.000000  37937.000000   \n",
              "\n",
              "       total_bedrooms    population    households  median_income  \\\n",
              "count    17000.000000  17000.000000  17000.000000   17000.000000   \n",
              "mean       539.410824   1429.573941    501.221941       3.883578   \n",
              "std        421.499452   1147.852959    384.520841       1.908157   \n",
              "min          1.000000      3.000000      1.000000       0.499900   \n",
              "25%        297.000000    790.000000    282.000000       2.566375   \n",
              "50%        434.000000   1167.000000    409.000000       3.544600   \n",
              "75%        648.250000   1721.000000    605.250000       4.767000   \n",
              "max       6445.000000  35682.000000   6082.000000      15.000100   \n",
              "\n",
              "       median_house_value  \n",
              "count        17000.000000  \n",
              "mean        207300.912353  \n",
              "std         115983.764387  \n",
              "min          14999.000000  \n",
              "25%         119400.000000  \n",
              "50%         180400.000000  \n",
              "75%         265000.000000  \n",
              "max         500001.000000  "
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 19
        }
      ]
    },
    {
      "metadata": {
        "id": "WrkBjfz5kEQu",
        "colab_type": "text"
      },
      "cell_type": "markdown",
      "source": [
        " 上面的示例使用 `DataFrame.describe` 来显示关于 `DataFrame` 的有趣统计信息。另一个实用函数是 `DataFrame.head`，它显示 `DataFrame` 的前几个记录："
      ]
    },
    {
      "metadata": {
        "id": "s3ND3bgOkB5k",
        "colab_type": "code",
        "colab": {
          "autoexec": {
            "startup": false,
            "wait_interval": 0
          },
          "base_uri": "https://localhost:8080/",
          "height": 195
        },
        "outputId": "65520b8f-c9cd-43dc-fd20-5c6300a950d9",
        "executionInfo": {
          "status": "ok",
          "timestamp": 1528027926502,
          "user_tz": -480,
          "elapsed": 1166,
          "user": {
            "displayName": "陈方杰",
            "photoUrl": "https://lh3.googleusercontent.com/a/default-user=s128",
            "userId": "115418685205938795893"
          }
        }
      },
      "cell_type": "code",
      "source": [
        "california_housing_dataframe.head()"
      ],
      "execution_count": 20,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/html": [
              "<div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>longitude</th>\n",
              "      <th>latitude</th>\n",
              "      <th>housing_median_age</th>\n",
              "      <th>total_rooms</th>\n",
              "      <th>total_bedrooms</th>\n",
              "      <th>population</th>\n",
              "      <th>households</th>\n",
              "      <th>median_income</th>\n",
              "      <th>median_house_value</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>0</th>\n",
              "      <td>-114.31</td>\n",
              "      <td>34.19</td>\n",
              "      <td>15.0</td>\n",
              "      <td>5612.0</td>\n",
              "      <td>1283.0</td>\n",
              "      <td>1015.0</td>\n",
              "      <td>472.0</td>\n",
              "      <td>1.4936</td>\n",
              "      <td>66900.0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>1</th>\n",
              "      <td>-114.47</td>\n",
              "      <td>34.40</td>\n",
              "      <td>19.0</td>\n",
              "      <td>7650.0</td>\n",
              "      <td>1901.0</td>\n",
              "      <td>1129.0</td>\n",
              "      <td>463.0</td>\n",
              "      <td>1.8200</td>\n",
              "      <td>80100.0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>2</th>\n",
              "      <td>-114.56</td>\n",
              "      <td>33.69</td>\n",
              "      <td>17.0</td>\n",
              "      <td>720.0</td>\n",
              "      <td>174.0</td>\n",
              "      <td>333.0</td>\n",
              "      <td>117.0</td>\n",
              "      <td>1.6509</td>\n",
              "      <td>85700.0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>3</th>\n",
              "      <td>-114.57</td>\n",
              "      <td>33.64</td>\n",
              "      <td>14.0</td>\n",
              "      <td>1501.0</td>\n",
              "      <td>337.0</td>\n",
              "      <td>515.0</td>\n",
              "      <td>226.0</td>\n",
              "      <td>3.1917</td>\n",
              "      <td>73400.0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>4</th>\n",
              "      <td>-114.57</td>\n",
              "      <td>33.57</td>\n",
              "      <td>20.0</td>\n",
              "      <td>1454.0</td>\n",
              "      <td>326.0</td>\n",
              "      <td>624.0</td>\n",
              "      <td>262.0</td>\n",
              "      <td>1.9250</td>\n",
              "      <td>65500.0</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>"
            ],
            "text/plain": [
              "   longitude  latitude  housing_median_age  total_rooms  total_bedrooms  \\\n",
              "0    -114.31     34.19                15.0       5612.0          1283.0   \n",
              "1    -114.47     34.40                19.0       7650.0          1901.0   \n",
              "2    -114.56     33.69                17.0        720.0           174.0   \n",
              "3    -114.57     33.64                14.0       1501.0           337.0   \n",
              "4    -114.57     33.57                20.0       1454.0           326.0   \n",
              "\n",
              "   population  households  median_income  median_house_value  \n",
              "0      1015.0       472.0         1.4936             66900.0  \n",
              "1      1129.0       463.0         1.8200             80100.0  \n",
              "2       333.0       117.0         1.6509             85700.0  \n",
              "3       515.0       226.0         3.1917             73400.0  \n",
              "4       624.0       262.0         1.9250             65500.0  "
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 20
        }
      ]
    },
    {
      "metadata": {
        "id": "w9-Es5Y6laGd",
        "colab_type": "text"
      },
      "cell_type": "markdown",
      "source": [
        " *pandas* 的另一个强大功能是绘制图表。例如，借助 `DataFrame.hist`，您可以快速了解一个列中值的分布："
      ]
    },
    {
      "metadata": {
        "id": "nqndFVXVlbPN",
        "colab_type": "code",
        "colab": {
          "autoexec": {
            "startup": false,
            "wait_interval": 0
          },
          "base_uri": "https://localhost:8080/",
          "height": 395
        },
        "outputId": "b7cc684b-908e-490e-fa44-46e50a0e8ab9",
        "executionInfo": {
          "status": "ok",
          "timestamp": 1528027930882,
          "user_tz": -480,
          "elapsed": 1385,
          "user": {
            "displayName": "陈方杰",
            "photoUrl": "https://lh3.googleusercontent.com/a/default-user=s128",
            "userId": "115418685205938795893"
          }
        }
      },
      "cell_type": "code",
      "source": [
        "california_housing_dataframe.hist('housing_median_age')"
      ],
      "execution_count": 21,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "array([[<matplotlib.axes._subplots.AxesSubplot object at 0x7f0ff1858610>]],\n",
              "      dtype=object)"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 21
        },
        {
          "output_type": "display_data",
          "data": {
            "image/png": "iVBORw0KGgoAAAANSUhEUgAAAeoAAAFZCAYAAABXM2zhAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMS4yLCBo\ndHRwOi8vbWF0cGxvdGxpYi5vcmcvNQv5yAAAIABJREFUeJzt3X1UlHX+//HXMDAH0UEEGTfLarf0\naEmaa5l4U0Iokp7IVRPWdU3q6Iqtlql499WTlajRmmZZmunRU7GNtofcAjJxyyRanT0uuu0p2VOr\neTejKCqgSPP7o9Os/FRguP1Az8dfcTEz1+d6H+3pdQ1zYfF6vV4BAAAjBTT3AgAAwPURagAADEao\nAQAwGKEGAMBghBoAAIMRagAADEaogVo6cuSI7rjjjkbdxz//+U+lpKQ06j4a0h133KEjR47o448/\n1ty5c5t7OUCrZOFz1EDtHDlyREOHDtW//vWv5l6KMe644w7l5ubqpptuau6lAK0WZ9SAn5xOp0aO\nHKn7779f27dv1w8//KA//elPio+PV3x8vNLS0lRaWipJiomJ0d69e33P/enry5cva/78+Ro2bJji\n4uI0bdo0nT9/XgUFBYqLi5MkrV69Ws8++6xSU1MVGxur0aNH6+TJk5KkgwcPaujQoRo6dKheeeUV\njRw5UgUFBdWue/Xq1Vq0aJEmT56sgQMHatasWcrLy9OoUaM0cOBA5eXlSZIuXbqk5557TsOGDVNM\nTIzWrl3re42//e1viouL0/Dhw7V+/Xrf9m3btmnixImSJI/Ho5SUFMXHxysmJkZvvfVWleN/9913\nNXr0aA0cOFDp6ek1zrusrEwzZszwrWfZsmW+71U3hx07dmjkyJGKjY3VpEmTdPr06Rr3BZiIUAN+\n+OGHH1RRUaEPPvhAc+fO1cqVK/XRRx/p008/1bZt2/TXv/5VJSUl2rhxY7Wvs3v3bh05ckTZ2dnK\nzc3V7bffrn/84x9XPS47O1vz5s3Tjh07FBERoa1bt0qSFi5cqIkTJyo3N1ft2rXTt99+W6v179q1\nSy+88II++OADZWdn+9Y9ZcoUrVu3TpK0bt06HTp0SB988IG2b9+unJwc5eXlqbKyUvPnz9eiRYv0\n0UcfKSAgQJWVlVft47XXXtNNN92k7Oxsbdq0SRkZGTp27Jjv+3//+9+VmZmprVu3asuWLTp+/Hi1\na37nnXd04cIFZWdn6/3339e2bdt8//i53hwOHz6s2bNnKyMjQ5988on69eunxYsX12pGgGkINeAH\nr9erxMREST9e9j1+/Lh27dqlxMREhYSEyGq1atSoUfr888+rfZ3w8HAVFRXp448/9p0xDho06KrH\n9e3bVzfeeKMsFot69OihY8eOqby8XAcPHtSIESMkSb/97W9V23ew7r77bkVERKhDhw6KjIzU4MGD\nJUndunXzna3n5eUpOTlZNptNISEhevjhh5Wbm6tvv/1Wly5d0sCBAyVJjzzyyDX3sWDBAi1cuFCS\n1KVLF0VGRurIkSO+748cOVJWq1WdOnVSRERElYhfy6RJk/Tqq6/KYrGoffv26tq1q44cOVLtHD79\n9FPde++96tatmyRp3Lhx2rlz5zX/YQGYLrC5FwC0JFarVW3atJEkBQQE6IcfftDp06fVvn1732Pa\nt2+vU6dOVfs6d911lxYsWKDNmzdrzpw5iomJ0aJFi656nN1ur7LvyspKnT17VhaLRaGhoZKkoKAg\nRURE1Gr9bdu2rfJ6ISEhVY5Fks6dO6elS5fqpZdekvTjpfC77rpLZ8+eVbt27aoc57UUFhb6zqID\nAgLkdrt9ry2pymv8dEzV+fbbb5Wenq7//Oc/CggI0PHjxzVq1Khq53Du3Dnt3btX8fHxVfZ75syZ\nWs8KMAWhBuqpY8eOOnPmjO/rM2fOqGPHjpKqBlCSzp496/vvn97TPnPmjObNm6c333xT0dHRNe6v\nXbt28nq9KisrU5s2bXT58uUGff/V4XBo0qRJGjJkSJXtRUVFOn/+vO/r6+1z1qxZ+v3vf6+kpCRZ\nLJZrXinwx7PPPqs777xTa9askdVq1bhx4yRVPweHw6Ho6GitWrWqXvsGTMClb6CeHnjgAWVlZams\nrEyXL1+W0+nU/fffL0mKjIzUv//9b0nShx9+qIsXL0qStm7dqjVr1kiSwsLC9Ktf/arW+2vbtq1u\nu+02ffTRR5KkzMxMWSyWBjue2NhYvffee6qsrJTX69Wrr76qTz/9VDfffLOsVqvvh7W2bdt2zf2e\nOnVKPXv2lMVi0fvvv6+ysjLfD9fVxalTp9SjRw9ZrVZ9/vnn+u6771RaWlrtHAYOHKi9e/fq8OHD\nkn782Ntzzz1X5zUAzYlQA/UUHx+vwYMHa9SoURoxYoR+8YtfaMKECZKkqVOnauPGjRoxYoSKiop0\n++23S/oxhj/9xPLw4cN16NAhPfbYY7Xe56JFi7R27Vo99NBDKi0tVadOnRos1snJyercubMeeugh\nxcfHq6ioSL/+9a8VFBSkJUuWaN68eRo+fLgsFovv0vmVpk+frtTUVI0cOVKlpaV69NFHtXDhQv33\nv/+t03r+8Ic/aNmyZRoxYoS+/PJLTZs2TatXr9a+ffuuOweHw6ElS5YoNTVVw4cP17PPPquEhIT6\njgZoFnyOGmihvF6vL8733XefNm7cqO7duzfzqpoec0Brxxk10AL98Y9/9H2cKj8/X16vV7feemvz\nLqoZMAf8HHBGDbRARUVFmjt3rs6ePaugoCDNmjVLN910k1JTU6/5+Ntuu833nrhpioqK6rzua83h\np58PAFoLQg0AgMG49A0AgMEINQAABjPyhidu9zm/Ht+hQ4iKi+v+Oc2fO+ZXd8yufphf3TG7+jFt\nfpGR9ut+r1WcUQcGWpt7CS0a86s7Zlc/zK/umF39tKT5tYpQAwDQWhFqAAAMRqgBADBYjT9MVlZW\nprS0NJ06dUoXL17U1KlT1b17d82ePVuVlZWKjIzUihUrZLPZlJWVpU2bNikgIEBjx47VmDFjVFFR\nobS0NB09elRWq1VLly5Vly5dmuLYAABo8Wo8o87Ly1PPnj21ZcsWrVy5Uunp6Vq1apWSk5P19ttv\n65ZbbpHT6VRpaanWrFmjjRs3avPmzdq0aZPOnDmj7du3KzQ0VO+8846mTJmijIyMpjguAABahRpD\nnZCQoCeeeEKSdOzYMXXq1EkFBQWKjY2VJA0ZMkT5+fnav3+/oqKiZLfbFRwcrD59+sjlcik/P19x\ncXGSpOjoaLlcrkY8HAAAWpdaf4563LhxOn78uNauXavHHntMNptNkhQRESG32y2Px6Pw8HDf48PD\nw6/aHhAQIIvFokuXLvmeDwAArq/WoX733Xf11VdfadasWbry9uDXu1W4v9uv1KFDiN+fcavuw+Ko\nGfOrO2ZXP8yv7phd/bSU+dUY6gMHDigiIkI33HCDevToocrKSrVt21bl5eUKDg7WiRMn5HA45HA4\n5PF4fM87efKkevfuLYfDIbfbre7du6uiokJer7fGs2l/7xYTGWn3+25m+B/mV3fMrn6YX90xu/ox\nbX71ujPZ3r17tWHDBkmSx+NRaWmpoqOjlZOTI0nKzc3VoEGD1KtXLxUWFqqkpEQXLlyQy+VS3759\nNWDAAGVnZ0v68QfT+vXr1xDHBADAz0KNZ9Tjxo3T/PnzlZycrPLycv3f//2fevbsqTlz5igzM1Od\nO3dWYmKigoKCNHPmTKWkpMhisSg1NVV2u10JCQnas2ePkpKSZLPZlJ6e3hTHBQBAq2Dk76P293KE\naZcwWhrmV3fMrn6YX90xu/oxbX7VXfo28rdnAcC1TErf2dxLqNGGtJjmXgJaGW4hCgCAwQg1AAAG\nI9QAABiMUAMAYDBCDQCAwQg1AAAGI9QAABiMUAMAYDBCDQCAwQg1AAAGI9QAABiMUAMAYDBCDQCA\nwQg1AAAGI9QAABiMUAMAYDBCDQCAwQg1AAAGI9QAABiMUAMAYDBCDQCAwQg1AAAGI9QAABiMUAMA\nYDBCDQCAwQg1AAAGI9QAABiMUAMAYDBCDQCAwQg1AAAGI9QAABiMUAMAYDBCDQCAwQg1AAAGI9QA\nABiMUAMAYDBCDQCAwQg1AAAGC6zNg5YvX659+/bp8uXLmjx5snbu3KmDBw8qLCxMkpSSkqIHHnhA\nWVlZ2rRpkwICAjR27FiNGTNGFRUVSktL09GjR2W1WrV06VJ16dKlUQ8KAIDWosZQf/HFF/rmm2+U\nmZmp4uJiPfLII7rvvvv09NNPa8iQIb7HlZaWas2aNXI6nQoKCtLo0aMVFxenvLw8hYaGKiMjQ7t3\n71ZGRoZWrlzZqAcFAEBrUeOl73vuuUcvv/yyJCk0NFRlZWWqrKy86nH79+9XVFSU7Ha7goOD1adP\nH7lcLuXn5ysuLk6SFB0dLZfL1cCHAABA61VjqK1Wq0JCQiRJTqdTgwcPltVq1ZYtWzRhwgQ99dRT\nOn36tDwej8LDw33PCw8Pl9vtrrI9ICBAFotFly5daqTDAQCgdanVe9SStGPHDjmdTm3YsEEHDhxQ\nWFiYevTooTfeeEOvvPKK7r777iqP93q913yd622/UocOIQoMtNZ2aZKkyEi7X49HVcyv7phd/bS2\n+TXl8bS22TW1ljK/WoX6s88+09q1a7V+/XrZ7Xb179/f972YmBgtXrxYw4YNk8fj8W0/efKkevfu\nLYfDIbfbre7du6uiokJer1c2m63a/RUXl/p1EJGRdrnd5/x6Dv6H+dUds6uf1ji/pjqe1ji7pmTa\n/Kr7R0ONl77PnTun5cuX6/XXX/f9lPeTTz6pw4cPS5IKCgrUtWtX9erVS4WFhSopKdGFCxfkcrnU\nt29fDRgwQNnZ2ZKkvLw89evXryGOCQCAn4Uaz6g//PBDFRcXa8aMGb5to0aN0owZM9SmTRuFhIRo\n6dKlCg4O1syZM5WSkiKLxaLU1FTZ7XYlJCRoz549SkpKks1mU3p6eqMeEAAArYnFW5s3jZuYv5cj\nTLuE0dIwv7pjdvXj7/wmpe9sxNU0jA1pMU2yH/7s1Y9p86vXpW8AANB8CDUAAAYj1AAAGIxQAwBg\nMEINAIDBCDUAAAYj1AAAGIxQAwBgMEINAIDBCDUAAAYj1AAAGIxQAwBgMEINAIDBCDUAAAYj1AAA\nGIxQAwBgMEINAIDBCDUAAAYj1AAAGIxQAwBgMEINAIDBCDUAAAYLbO4FAA1lUvrO5l5CtTakxTT3\nEgC0QJxRAwBgMEINAIDBCDUAAAYj1AAAGIxQAwBgMEINAIDBCDUAAAYj1AAAGIxQAwBgMEINAIDB\nCDUAAAYj1AAAGIxQAwBgMEINAIDBCDUAAAbj91EDTcT035ct8TuzARNxRg0AgMFqdUa9fPly7du3\nT5cvX9bkyZMVFRWl2bNnq7KyUpGRkVqxYoVsNpuysrK0adMmBQQEaOzYsRozZowqKiqUlpamo0eP\nymq1aunSperSpUtjHxcAAK1CjaH+4osv9M033ygzM1PFxcV65JFH1L9/fyUnJ2v48OF66aWX5HQ6\nlZiYqDVr1sjpdCooKEijR49WXFyc8vLyFBoaqoyMDO3evVsZGRlauXJlUxwbAAAtXo2Xvu+55x69\n/PLLkqTQ0FCVlZWpoKBAsbGxkqQhQ4YoPz9f+/fvV1RUlOx2u4KDg9WnTx+5XC7l5+crLi5OkhQd\nHS2Xy9WIhwMAQOtS4xm11WpVSEiIJMnpdGrw4MHavXu3bDabJCkiIkJut1sej0fh4eG+54WHh1+1\nPSAgQBaLRZcuXfI9/1o6dAhRYKDVrwOJjLT79XhUxfwgNc+fg9b2Z68pj6e1za6ptZT51fqnvnfs\n2CGn06kNGzZo6NChvu1er/eaj/d3+5WKi0truyxJPw7b7T7n13PwP8wPP2nqPwet8c9eUx1Pa5xd\nUzJtftX9o6FWP/X92Wefae3atVq3bp3sdrtCQkJUXl4uSTpx4oQcDoccDoc8Ho/vOSdPnvRtd7vd\nkqSKigp5vd5qz6YBAMD/1Bjqc+fOafny5Xr99dcVFhYm6cf3mnNyciRJubm5GjRokHr16qXCwkKV\nlJTowoULcrlc6tu3rwYMGKDs7GxJUl5envr169eIhwMAQOtS46XvDz/8UMXFxZoxY4ZvW3p6uhYs\nWKDMzEx17txZiYmJCgoK0syZM5WSkiKLxaLU1FTZ7XYlJCRoz549SkpKks1mU3p6eqMeEAAArUmN\noX700Uf16KOPXrX9rbfeumpbfHy84uPjq2z76bPTAADAf9xCFIBPS7jNKfBzwy1EAQAwGKEGAMBg\nhBoAAIMRagAADEaoAQAwGKEGAMBghBoAAIMRagAADEaoAQAwGHcmQ61wxyoAaB6cUQMAYDBCDQCA\nwQg1AAAGI9QAABiMUAMAYDBCDQCAwQg1AAAGI9QAABiMUAMAYDBCDQCAwQg1AAAGI9QAABiMUAMA\nYDBCDQCAwQg1AAAGI9QAABiMUAMAYDBCDQCAwQg1AAAGI9QAABgssLkXAADAlSal72zuJdRoQ1pM\nk+2LM2oAAAxGqAEAMBihBgDAYIQaAACDEWoAAAxGqAEAMBihBgDAYLX6HPXXX3+tqVOnauLEiRo/\nfrzS0tJ08OBBhYWFSZJSUlL0wAMPKCsrS5s2bVJAQIDGjh2rMWPGqKKiQmlpaTp69KisVquWLl2q\nLl26NOpBAUBz4TPAaGg1hrq0tFRLlixR//79q2x/+umnNWTIkCqPW7NmjZxOp4KCgjR69GjFxcUp\nLy9PoaGhysjI0O7du5WRkaGVK1c2/JEAANAK1Xjp22azad26dXI4HNU+bv/+/YqKipLdbldwcLD6\n9Okjl8ul/Px8xcXFSZKio6PlcrkaZuUAAPwM1BjqwMBABQcHX7V9y5YtmjBhgp566imdPn1aHo9H\n4eHhvu+Hh4fL7XZX2R4QECCLxaJLly414CEAANB61ele3w8//LDCwsLUo0cPvfHGG3rllVd09913\nV3mM1+u95nOvt/1KHTqEKDDQ6teaIiPtfj0eVTE/4OeDv+/115QzrFOor3y/OiYmRosXL9awYcPk\n8Xh820+ePKnevXvL4XDI7Xare/fuqqiokNfrlc1mq/b1i4tL/VpPZKRdbvc5/w4CPswP+Hnh73v9\nNfQMqwt/nT6e9eSTT+rw4cOSpIKCAnXt2lW9evVSYWGhSkpKdOHCBblcLvXt21cDBgxQdna2JCkv\nL0/9+vWryy4BAPhZqvGM+sCBA1q2bJm+//57BQYGKicnR+PHj9eMGTPUpk0bhYSEaOnSpQoODtbM\nmTOVkpIii8Wi1NRU2e12JSQkaM+ePUpKSpLNZlN6enpTHBcAAK1CjaHu2bOnNm/efNX2YcOGXbUt\nPj5e8fHxVbb99NlpAADgP+5MBgCAwQg1AAAGI9QAABiMUAMAYDBCDQCAwQg1AAAGI9QAABiMUAMA\nYDBCDQCAwQg1AAAGI9QAABiMUAMAYLA6/T5qAEDLNSl9Z3MvAX7gjBoAAIMRagAADEaoAQAwGKEG\nAMBghBoAAIMRagAADEaoAQAwGKEGAMBghBoAAIMRagAADEaoAQAwGKEGAMBghBoAAIMRagAADEao\nAQAwGKEGAMBghBoAAIMRagAADEaoAQAwGKEGAMBghBoAAIMRagAADEaoAQAwGKEGAMBghBoAAIMR\nagAADFarUH/99dd68MEHtWXLFknSsWPH9Lvf/U7JycmaPn26Ll26JEnKysrSb37zG40ZM0bvvfee\nJKmiokIzZ85UUlKSxo8fr8OHDzfSoQAA0PrUGOrS0lItWbJE/fv3921btWqVkpOT9fbbb+uWW26R\n0+lUaWmp1qxZo40bN2rz5s3atGmTzpw5o+3btys0NFTvvPOOpkyZooyMjEY9IAAAWpMaQ22z2bRu\n3To5HA7ftoKCAsXGxkqShgwZovz8fO3fv19RUVGy2+0KDg5Wnz595HK5lJ+fr7i4OElSdHS0XC5X\nIx0KAACtT42hDgwMVHBwcJVtZWVlstlskqSIiAi53W55PB6Fh4f7HhMeHn7V9oCAAFksFt+lcgAA\nUL3A+r6A1+ttkO1X6tAhRIGBVr/WERlp9+vxqIr5AUDtNeX/M+sU6pCQEJWXlys4OFgnTpyQw+GQ\nw+GQx+PxPebkyZPq3bu3HA6H3G63unfvroqKCnm9Xt/Z+PUUF5f6tZ7ISLvc7nN1ORSI+QGAvxr6\n/5nVhb9OH8+Kjo5WTk6OJCk3N1eDBg1Sr169VFhYqJKSEl24cEEul0t9+/bVgAEDlJ2dLUnKy8tT\nv3796rJLAAB+lmo8oz5w4ICWLVum77//XoGBgcrJydGLL76otLQ0ZWZmqnPnzkpMTFRQUJBmzpyp\nlJQUWSwWpaamym63KyEhQXv27FFSUpJsNpvS09Ob4rgAAGgVLN7avGncxPy9pMCl2/qpzfwmpe9s\notUAgPk2pMU06Os1+KVvAADQNOr9U99oGJyxAgCuhTNqAAAMRqgBADAYoQYAwGCEGgAAgxFqAAAM\nRqgBADAYoQYAwGCEGgAAgxFqAAAMRqgBADAYoQYAwGCEGgAAgxFqAAAMRqgBADAYoQYAwGCEGgAA\ngxFqAAAMRqgBADAYoQYAwGCEGgAAgxFqAAAMRqgBADAYoQYAwGCEGgAAgxFqAAAMRqgBADAYoQYA\nwGCEGgAAgxFqAAAMRqgBADAYoQYAwGCEGgAAgxFqAAAMFtjcC2gKk9J3NvcSAACoE86oAQAwGKEG\nAMBghBoAAIMRagAADFanHyYrKCjQ9OnT1bVrV0lSt27d9Pjjj2v27NmqrKxUZGSkVqxYIZvNpqys\nLG3atEkBAQEaO3asxowZ06AHAABAa1bnn/q+9957tWrVKt/Xc+fOVXJysoYPH66XXnpJTqdTiYmJ\nWrNmjZxOp4KCgjR69GjFxcUpLCysQRYPAEBr12CXvgsKChQbGytJGjJkiPLz87V//35FRUXJbrcr\nODhYffr0kcvlaqhdAgDQ6tX5jPrQoUOaMmWKzp49q2nTpqmsrEw2m02SFBERIbfbLY/Ho/DwcN9z\nwsPD5Xa7a3ztDh1CFBho9Ws9kZF2/w4AAIA6asrm1CnUt956q6ZNm6bhw4fr8OHDmjBhgiorK33f\n93q913ze9bb//4qLS/1aT2SkXW73Ob+eAwBAXTV0c6oLf50ufXfq1EkJCQmyWCy6+eab1bFjR509\ne1bl5eWSpBMnTsjhcMjhcMjj8fied/LkSTkcjrrsEgCAn6U6hTorK0tvvvmmJMntduvUqVMaNWqU\ncnJyJEm5ubkaNGiQevXqpcLCQpWUlOjChQtyuVzq27dvw60eAIBWrk6XvmNiYvTMM8/ok08+UUVF\nhRYvXqwePXpozpw5yszMVOfOnZWYmKigoCDNnDlTKSkpslgsSk1Nld3Oe8kAANSWxVvbN46bkL/X\n/mt6j5pfygEAaEgb0mIa9PUa/D1qAADQNAg1AAAGI9QAABiMUAMAYDBCDQCAwQg1AAAGI9QAABiM\nUAMAYDBCDQCAwQg1AAAGI9QAABiMUAMAYDBCDQCAwQg1AAAGI9QAABiMUAMAYDBCDQCAwQg1AAAG\nI9QAABiMUAMAYDBCDQCAwQg1AAAGI9QAABiMUAMAYDBCDQCAwQg1AAAGI9QAABiMUAMAYDBCDQCA\nwQg1AAAGI9QAABiMUAMAYDBCDQCAwQg1AAAGI9QAABiMUAMAYDBCDQCAwQg1AAAGI9QAABgssCl2\n8sILL2j//v2yWCyaN2+e7rrrrqbYLQAALV6jh/rLL7/Ud999p8zMTBUVFWnevHnKzMxs7N0CANAq\nNPql7/z8fD344IOSpNtuu01nz57V+fPnG3u3AAC0Co0eao/How4dOvi+Dg8Pl9vtbuzdAgDQKjTJ\ne9RX8nq9NT4mMtLu9+tW95wPMh72+/UAADBBo59ROxwOeTwe39cnT55UZGRkY+8WAIBWodFDPWDA\nAOXk5EiSDh48KIfDoXbt2jX2bgEAaBUa/dJ3nz59dOedd2rcuHGyWCxatGhRY+8SAIBWw+KtzZvG\nAACgWXBnMgAADEaoAQAwWJN/PKuhcXtS/3399deaOnWqJk6cqPHjx+vYsWOaPXu2KisrFRkZqRUr\nVshmszX3Mo20fPly7du3T5cvX9bkyZMVFRXF7GqhrKxMaWlpOnXqlC5evKipU6eqe/fuzM5P5eXl\nGjFihKZOnar+/fszv1oqKCjQ9OnT1bVrV0lSt27d9Pjjj7eY+bXoM+orb0/6/PPP6/nnn2/uJRmv\ntLRUS5YsUf/+/X3bVq1apeTkZL399tu65ZZb5HQ6m3GF5vriiy/0zTffKDMzU+vXr9cLL7zA7Gop\nLy9PPXv21JYtW7Ry5Uqlp6czuzp47bXX1L59e0n8vfXXvffeq82bN2vz5s1auHBhi5pfiw41tyf1\nn81m07p16+RwOHzbCgoKFBsbK0kaMmSI8vPzm2t5Rrvnnnv08ssvS5JCQ0NVVlbG7GopISFBTzzx\nhCTp2LFj6tSpE7PzU1FRkQ4dOqQHHnhAEn9v66slza9Fh5rbk/ovMDBQwcHBVbaVlZX5LvlEREQw\nw+uwWq0KCQmRJDmdTg0ePJjZ+WncuHF65plnNG/ePGbnp2XLliktLc33NfPzz6FDhzRlyhQlJSXp\n888/b1Hza/HvUV+JT5rVHzOs2Y4dO+R0OrVhwwYNHTrUt53Z1ezdd9/VV199pVmzZlWZF7Or3l/+\n8hf17t1bXbp0ueb3mV/1br31Vk2bNk3Dhw/X4cOHNWHCBFVWVvq+b/r8WnSouT1pwwgJCVF5ebmC\ng4N14sSJKpfFUdVnn32mtWvXav369bLb7cyulg4cOKCIiAjdcMMN6tGjhyorK9W2bVtmV0u7du3S\n4cOHtWvXLh0/flw2m40/e37o1KmTEhISJEk333yzOnbsqMLCwhYzvxZ96ZvbkzaM6Oho3xxzc3M1\naNCgZl6Rmc6dO6fly5fr9ddfV1hYmCRmV1t79+7Vhg0bJP34llVpaSmz88PKlSu1detW/fnPf9aY\nMWM0depU5ueHrKwsvfnmm5Ikt9utU6dOadSoUS1mfi3+zmQvvvii9u7d67s9affu3Zt7SUY7cOCA\nli1bpu+//16BgYHq1KmTXnytKYqYAAAArElEQVTxRaWlpenixYvq3Lmzli5dqqCgoOZeqnEyMzO1\nevVq/fKXv/RtS09P14IFC5hdDcrLyzV//nwdO3ZM5eXlmjZtmnr27Kk5c+YwOz+tXr1aN954owYO\nHMj8aun8+fN65plnVFJSooqKCk2bNk09evRoMfNr8aEGAKA1a9GXvgEAaO0INQAABiPUAAAYjFAD\nAGAwQg0AgMEINQAABiPUAAAYjFADAGCw/wdkB5RjykY3PgAAAABJRU5ErkJggg==\n",
            "text/plain": [
              "<matplotlib.figure.Figure at 0x7f0ff3977b90>"
            ]
          },
          "metadata": {
            "tags": []
          }
        }
      ]
    },
    {
      "metadata": {
        "id": "XtYZ7114n3b-",
        "colab_type": "text"
      },
      "cell_type": "markdown",
      "source": [
        " ## 访问数据\n",
        "\n",
        "您可以使用熟悉的 Python dict/list 指令访问 `DataFrame` 数据："
      ]
    },
    {
      "metadata": {
        "id": "_TFm7-looBFF",
        "colab_type": "code",
        "colab": {
          "autoexec": {
            "startup": false,
            "wait_interval": 0
          },
          "base_uri": "https://localhost:8080/",
          "height": 101
        },
        "outputId": "5676d59f-377e-4944-f125-b8c32242b145",
        "executionInfo": {
          "status": "ok",
          "timestamp": 1528027935465,
          "user_tz": -480,
          "elapsed": 1353,
          "user": {
            "displayName": "陈方杰",
            "photoUrl": "https://lh3.googleusercontent.com/a/default-user=s128",
            "userId": "115418685205938795893"
          }
        }
      },
      "cell_type": "code",
      "source": [
        "cities = pd.DataFrame({ 'City name': city_names, 'Population': population })\n",
        "print type(cities['City name'])\n",
        "cities['City name']"
      ],
      "execution_count": 22,
      "outputs": [
        {
          "output_type": "stream",
          "text": [
            "<class 'pandas.core.series.Series'>\n"
          ],
          "name": "stdout"
        },
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "0    San Francisco\n",
              "1         San Jose\n",
              "2       Sacramento\n",
              "Name: City name, dtype: object"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 22
        }
      ]
    },
    {
      "metadata": {
        "id": "V5L6xacLoxyv",
        "colab_type": "code",
        "colab": {
          "autoexec": {
            "startup": false,
            "wait_interval": 0
          },
          "base_uri": "https://localhost:8080/",
          "height": 50
        },
        "outputId": "77170b83-33cd-47b1-9db4-03a825e34d1e",
        "executionInfo": {
          "status": "ok",
          "timestamp": 1528027938449,
          "user_tz": -480,
          "elapsed": 1241,
          "user": {
            "displayName": "陈方杰",
            "photoUrl": "https://lh3.googleusercontent.com/a/default-user=s128",
            "userId": "115418685205938795893"
          }
        }
      },
      "cell_type": "code",
      "source": [
        "print type(cities['City name'][1])\n",
        "cities['City name'][1]"
      ],
      "execution_count": 23,
      "outputs": [
        {
          "output_type": "stream",
          "text": [
            "<type 'str'>\n"
          ],
          "name": "stdout"
        },
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "'San Jose'"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 23
        }
      ]
    },
    {
      "metadata": {
        "id": "gcYX1tBPugZl",
        "colab_type": "code",
        "colab": {
          "autoexec": {
            "startup": false,
            "wait_interval": 0
          },
          "base_uri": "https://localhost:8080/",
          "height": 123
        },
        "outputId": "ce5ad5e9-6509-4da0-fc82-e867a55b482a",
        "executionInfo": {
          "status": "ok",
          "timestamp": 1528027941171,
          "user_tz": -480,
          "elapsed": 1156,
          "user": {
            "displayName": "陈方杰",
            "photoUrl": "https://lh3.googleusercontent.com/a/default-user=s128",
            "userId": "115418685205938795893"
          }
        }
      },
      "cell_type": "code",
      "source": [
        "print type(cities[0:2])\n",
        "cities[0:2]"
      ],
      "execution_count": 24,
      "outputs": [
        {
          "output_type": "stream",
          "text": [
            "<class 'pandas.core.frame.DataFrame'>\n"
          ],
          "name": "stdout"
        },
        {
          "output_type": "execute_result",
          "data": {
            "text/html": [
              "<div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>City name</th>\n",
              "      <th>Population</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>0</th>\n",
              "      <td>San Francisco</td>\n",
              "      <td>852469</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>1</th>\n",
              "      <td>San Jose</td>\n",
              "      <td>1015785</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>"
            ],
            "text/plain": [
              "       City name  Population\n",
              "0  San Francisco      852469\n",
              "1       San Jose     1015785"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 24
        }
      ]
    },
    {
      "metadata": {
        "id": "65g1ZdGVjXsQ",
        "colab_type": "text"
      },
      "cell_type": "markdown",
      "source": [
        " 此外，*pandas* 针对高级[索引和选择](http://pandas.pydata.org/pandas-docs/stable/indexing.html)提供了极其丰富的 API（数量过多，此处无法逐一列出）。"
      ]
    },
    {
      "metadata": {
        "id": "RM1iaD-ka3Y1",
        "colab_type": "text"
      },
      "cell_type": "markdown",
      "source": [
        " ## 操控数据\n",
        "\n",
        "您可以向 `Series` 应用 Python 的基本运算指令。例如："
      ]
    },
    {
      "metadata": {
        "id": "XWmyCFJ5bOv-",
        "colab_type": "code",
        "colab": {
          "autoexec": {
            "startup": false,
            "wait_interval": 0
          },
          "base_uri": "https://localhost:8080/",
          "height": 84
        },
        "outputId": "7de9ada2-e463-49e0-d76c-c356980d9619",
        "executionInfo": {
          "status": "ok",
          "timestamp": 1528027944724,
          "user_tz": -480,
          "elapsed": 1342,
          "user": {
            "displayName": "陈方杰",
            "photoUrl": "https://lh3.googleusercontent.com/a/default-user=s128",
            "userId": "115418685205938795893"
          }
        }
      },
      "cell_type": "code",
      "source": [
        "population / 1000."
      ],
      "execution_count": 25,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "0     852.469\n",
              "1    1015.785\n",
              "2     485.199\n",
              "dtype: float64"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 25
        }
      ]
    },
    {
      "metadata": {
        "id": "TQzIVnbnmWGM",
        "colab_type": "text"
      },
      "cell_type": "markdown",
      "source": [
        " [NumPy](http://www.numpy.org/) 是一种用于进行科学计算的常用工具包。*pandas* `Series` 可用作大多数 NumPy 函数的参数："
      ]
    },
    {
      "metadata": {
        "id": "ko6pLK6JmkYP",
        "colab_type": "code",
        "colab": {
          "autoexec": {
            "startup": false,
            "wait_interval": 0
          },
          "base_uri": "https://localhost:8080/",
          "height": 84
        },
        "outputId": "7534bbe4-e719-4244-e0cf-e63da3d9993a",
        "executionInfo": {
          "status": "ok",
          "timestamp": 1528027947332,
          "user_tz": -480,
          "elapsed": 1172,
          "user": {
            "displayName": "陈方杰",
            "photoUrl": "https://lh3.googleusercontent.com/a/default-user=s128",
            "userId": "115418685205938795893"
          }
        }
      },
      "cell_type": "code",
      "source": [
        "import numpy as np\n",
        "\n",
        "np.log(population)"
      ],
      "execution_count": 26,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "0    13.655892\n",
              "1    13.831172\n",
              "2    13.092314\n",
              "dtype: float64"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 26
        }
      ]
    },
    {
      "metadata": {
        "id": "xmxFuQmurr6d",
        "colab_type": "text"
      },
      "cell_type": "markdown",
      "source": [
        " 对于更复杂的单列转换，您可以使用 `Series.apply`。像 Python [映射函数](https://docs.python.org/2/library/functions.html#map)一样，`Series.apply` 将以参数形式接受 [lambda 函数](https://docs.python.org/2/tutorial/controlflow.html#lambda-expressions)，而该函数会应用于每个值。\n",
        "\n",
        "下面的示例创建了一个指明 `population` 是否超过 100 万的新 `Series`："
      ]
    },
    {
      "metadata": {
        "id": "Fc1DvPAbstjI",
        "colab_type": "code",
        "colab": {
          "autoexec": {
            "startup": false,
            "wait_interval": 0
          },
          "base_uri": "https://localhost:8080/",
          "height": 84
        },
        "outputId": "55250445-9035-4fbb-d6f7-de838d4e2d2d",
        "executionInfo": {
          "status": "ok",
          "timestamp": 1528027949958,
          "user_tz": -480,
          "elapsed": 1337,
          "user": {
            "displayName": "陈方杰",
            "photoUrl": "https://lh3.googleusercontent.com/a/default-user=s128",
            "userId": "115418685205938795893"
          }
        }
      },
      "cell_type": "code",
      "source": [
        "population.apply(lambda val: val > 1000000)"
      ],
      "execution_count": 27,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "0    False\n",
              "1     True\n",
              "2    False\n",
              "dtype: bool"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 27
        }
      ]
    },
    {
      "metadata": {
        "id": "6qh63m-ayb-c",
        "colab_type": "text"
      },
      "cell_type": "markdown",
      "source": [
        " ## 练习 1\n",
        "\n",
        "通过添加一个新的布尔值列（当且仅当以下*两项*均为 True 时为 True）修改 `cities` 表格：\n",
        "\n",
        "  * 城市以圣人命名。\n",
        "  * 城市面积大于 50 平方英里。\n",
        "\n",
        "**注意：**布尔值 `Series` 是使用“按位”而非传统布尔值“运算符”组合的。例如，执行*逻辑与*时，应使用 `&`，而不是 `and`。\n",
        "\n",
        "**提示：**\"San\" 在西班牙语中意为 \"saint\"。"
      ]
    },
    {
      "metadata": {
        "id": "ZeYYLoV9b9fB",
        "colab_type": "text"
      },
      "cell_type": "markdown",
      "source": [
        " \n",
        "`DataFrames` 的修改方式也非常简单。例如，以下代码向现有 `DataFrame` 添加了两个 `Series`："
      ]
    },
    {
      "metadata": {
        "id": "0gCEX99Hb8LR",
        "colab_type": "code",
        "colab": {
          "autoexec": {
            "startup": false,
            "wait_interval": 0
          },
          "base_uri": "https://localhost:8080/",
          "height": 136
        },
        "outputId": "5f7bff59-cef4-4b20-83d0-1d98b1ccd6c3",
        "executionInfo": {
          "status": "ok",
          "timestamp": 1528027952256,
          "user_tz": -480,
          "elapsed": 1152,
          "user": {
            "displayName": "陈方杰",
            "photoUrl": "https://lh3.googleusercontent.com/a/default-user=s128",
            "userId": "115418685205938795893"
          }
        }
      },
      "cell_type": "code",
      "source": [
        "cities['Area square miles'] = pd.Series([46.87, 176.53, 97.92])\n",
        "cities['Population density'] = cities['Population'] / cities['Area square miles']\n",
        "cities"
      ],
      "execution_count": 28,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/html": [
              "<div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>City name</th>\n",
              "      <th>Population</th>\n",
              "      <th>Area square miles</th>\n",
              "      <th>Population density</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>0</th>\n",
              "      <td>San Francisco</td>\n",
              "      <td>852469</td>\n",
              "      <td>46.87</td>\n",
              "      <td>18187.945381</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>1</th>\n",
              "      <td>San Jose</td>\n",
              "      <td>1015785</td>\n",
              "      <td>176.53</td>\n",
              "      <td>5754.177760</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>2</th>\n",
              "      <td>Sacramento</td>\n",
              "      <td>485199</td>\n",
              "      <td>97.92</td>\n",
              "      <td>4955.055147</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>"
            ],
            "text/plain": [
              "       City name  Population  Area square miles  Population density\n",
              "0  San Francisco      852469              46.87        18187.945381\n",
              "1       San Jose     1015785             176.53         5754.177760\n",
              "2     Sacramento      485199              97.92         4955.055147"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 28
        }
      ]
    },
    {
      "metadata": {
        "id": "zCOn8ftSyddH",
        "colab_type": "code",
        "colab": {
          "autoexec": {
            "startup": false,
            "wait_interval": 0
          },
          "base_uri": "https://localhost:8080/",
          "height": 136
        },
        "outputId": "3f55dfb8-faa8-48a5-a8d8-c320df9a262a",
        "executionInfo": {
          "status": "ok",
          "timestamp": 1528028043662,
          "user_tz": -480,
          "elapsed": 1341,
          "user": {
            "displayName": "陈方杰",
            "photoUrl": "https://lh3.googleusercontent.com/a/default-user=s128",
            "userId": "115418685205938795893"
          }
        }
      },
      "cell_type": "code",
      "source": [
        "# Your code here\n",
        "# [2018-06-03] Amusi add:\n",
        "# 布尔值 Series 是使用“按位”而非传统布尔值“运算符”组合的。例如，执行逻辑与时，应使用 &，而不是 and。\n",
        "# 注: Panda series apply使用说明：http://pandas.pydata.org/pandas-docs/stable/generated/pandas.Series.apply.html\n",
        "\n",
        "cities['Is wide and has saint name'] = (cities['Area square miles'] > 50) & cities['City name'] .apply(lambda name: name.startswith('San'))\n",
        "cities"
      ],
      "execution_count": 32,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/html": [
              "<div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>City name</th>\n",
              "      <th>Population</th>\n",
              "      <th>Area square miles</th>\n",
              "      <th>Population density</th>\n",
              "      <th>Is wide and has saint name</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>0</th>\n",
              "      <td>San Francisco</td>\n",
              "      <td>852469</td>\n",
              "      <td>46.87</td>\n",
              "      <td>18187.945381</td>\n",
              "      <td>False</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>1</th>\n",
              "      <td>San Jose</td>\n",
              "      <td>1015785</td>\n",
              "      <td>176.53</td>\n",
              "      <td>5754.177760</td>\n",
              "      <td>True</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>2</th>\n",
              "      <td>Sacramento</td>\n",
              "      <td>485199</td>\n",
              "      <td>97.92</td>\n",
              "      <td>4955.055147</td>\n",
              "      <td>False</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>"
            ],
            "text/plain": [
              "       City name  Population  Area square miles  Population density  \\\n",
              "0  San Francisco      852469              46.87        18187.945381   \n",
              "1       San Jose     1015785             176.53         5754.177760   \n",
              "2     Sacramento      485199              97.92         4955.055147   \n",
              "\n",
              "   Is wide and has saint name  \n",
              "0                       False  \n",
              "1                        True  \n",
              "2                       False  "
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 32
        }
      ]
    },
    {
      "metadata": {
        "id": "YHIWvc9Ms-Ll",
        "colab_type": "text"
      },
      "cell_type": "markdown",
      "source": [
        " ### 解决方案\n",
        "\n",
        "点击下方，查看解决方案。"
      ]
    },
    {
      "metadata": {
        "id": "T5OlrqtdtCIb",
        "colab_type": "code",
        "colab": {
          "autoexec": {
            "startup": false,
            "wait_interval": 0
          },
          "base_uri": "https://localhost:8080/",
          "height": 136
        },
        "outputId": "1212940f-9af4-4039-d03a-ae2fd4d72e3b",
        "executionInfo": {
          "status": "ok",
          "timestamp": 1528028019617,
          "user_tz": -480,
          "elapsed": 1848,
          "user": {
            "displayName": "陈方杰",
            "photoUrl": "https://lh3.googleusercontent.com/a/default-user=s128",
            "userId": "115418685205938795893"
          }
        }
      },
      "cell_type": "code",
      "source": [
        "cities['Is wide and has saint name'] = (cities['Area square miles'] > 50) & cities['City name'].apply(lambda name: name.startswith('San'))\n",
        "cities"
      ],
      "execution_count": 31,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/html": [
              "<div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>City name</th>\n",
              "      <th>Population</th>\n",
              "      <th>Area square miles</th>\n",
              "      <th>Population density</th>\n",
              "      <th>Is wide and has saint name</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>0</th>\n",
              "      <td>San Francisco</td>\n",
              "      <td>852469</td>\n",
              "      <td>46.87</td>\n",
              "      <td>18187.945381</td>\n",
              "      <td>False</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>1</th>\n",
              "      <td>San Jose</td>\n",
              "      <td>1015785</td>\n",
              "      <td>176.53</td>\n",
              "      <td>5754.177760</td>\n",
              "      <td>True</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>2</th>\n",
              "      <td>Sacramento</td>\n",
              "      <td>485199</td>\n",
              "      <td>97.92</td>\n",
              "      <td>4955.055147</td>\n",
              "      <td>False</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>"
            ],
            "text/plain": [
              "       City name  Population  Area square miles  Population density  \\\n",
              "0  San Francisco      852469              46.87        18187.945381   \n",
              "1       San Jose     1015785             176.53         5754.177760   \n",
              "2     Sacramento      485199              97.92         4955.055147   \n",
              "\n",
              "   Is wide and has saint name  \n",
              "0                       False  \n",
              "1                        True  \n",
              "2                       False  "
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 31
        }
      ]
    },
    {
      "metadata": {
        "id": "f-xAOJeMiXFB",
        "colab_type": "text"
      },
      "cell_type": "markdown",
      "source": [
        " ## 索引\n",
        "`Series` 和 `DataFrame` 对象也定义了 `index` 属性，该属性会向每个 `Series` 项或 `DataFrame` 行赋一个标识符值。\n",
        "\n",
        "默认情况下，在构造时，*pandas* 会赋可反映源数据顺序的索引值。索引值在创建后是稳定的；也就是说，它们不会因为数据重新排序而发生改变。"
      ]
    },
    {
      "metadata": {
        "id": "2684gsWNinq9",
        "colab_type": "code",
        "colab": {
          "autoexec": {
            "startup": false,
            "wait_interval": 0
          },
          "base_uri": "https://localhost:8080/",
          "height": 34
        },
        "outputId": "3c3fa21d-ba90-4db6-888d-cbd1df2028ed",
        "executionInfo": {
          "status": "ok",
          "timestamp": 1528028079282,
          "user_tz": -480,
          "elapsed": 1359,
          "user": {
            "displayName": "陈方杰",
            "photoUrl": "https://lh3.googleusercontent.com/a/default-user=s128",
            "userId": "115418685205938795893"
          }
        }
      },
      "cell_type": "code",
      "source": [
        "city_names.index"
      ],
      "execution_count": 33,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "RangeIndex(start=0, stop=3, step=1)"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 33
        }
      ]
    },
    {
      "metadata": {
        "id": "F_qPe2TBjfWd",
        "colab_type": "code",
        "colab": {
          "autoexec": {
            "startup": false,
            "wait_interval": 0
          },
          "base_uri": "https://localhost:8080/",
          "height": 34
        },
        "outputId": "5364feec-d2c4-425e-e09d-b8a0590b1ca5",
        "executionInfo": {
          "status": "ok",
          "timestamp": 1528028086703,
          "user_tz": -480,
          "elapsed": 1169,
          "user": {
            "displayName": "陈方杰",
            "photoUrl": "https://lh3.googleusercontent.com/a/default-user=s128",
            "userId": "115418685205938795893"
          }
        }
      },
      "cell_type": "code",
      "source": [
        "cities.index"
      ],
      "execution_count": 34,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "RangeIndex(start=0, stop=3, step=1)"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 34
        }
      ]
    },
    {
      "metadata": {
        "id": "hp2oWY9Slo_h",
        "colab_type": "text"
      },
      "cell_type": "markdown",
      "source": [
        " 调用 `DataFrame.reindex` 以手动重新排列各行的顺序。例如，以下方式与按城市名称排序具有相同的效果："
      ]
    },
    {
      "metadata": {
        "id": "sN0zUzSAj-U1",
        "colab_type": "code",
        "colab": {
          "autoexec": {
            "startup": false,
            "wait_interval": 0
          },
          "base_uri": "https://localhost:8080/",
          "height": 136
        },
        "outputId": "34406cf9-eefb-4dab-8eb0-3bf677de2c69",
        "executionInfo": {
          "status": "ok",
          "timestamp": 1528028093432,
          "user_tz": -480,
          "elapsed": 1183,
          "user": {
            "displayName": "陈方杰",
            "photoUrl": "https://lh3.googleusercontent.com/a/default-user=s128",
            "userId": "115418685205938795893"
          }
        }
      },
      "cell_type": "code",
      "source": [
        "cities.reindex([2, 0, 1])"
      ],
      "execution_count": 35,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/html": [
              "<div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>City name</th>\n",
              "      <th>Population</th>\n",
              "      <th>Area square miles</th>\n",
              "      <th>Population density</th>\n",
              "      <th>Is wide and has saint name</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>2</th>\n",
              "      <td>Sacramento</td>\n",
              "      <td>485199</td>\n",
              "      <td>97.92</td>\n",
              "      <td>4955.055147</td>\n",
              "      <td>False</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>0</th>\n",
              "      <td>San Francisco</td>\n",
              "      <td>852469</td>\n",
              "      <td>46.87</td>\n",
              "      <td>18187.945381</td>\n",
              "      <td>False</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>1</th>\n",
              "      <td>San Jose</td>\n",
              "      <td>1015785</td>\n",
              "      <td>176.53</td>\n",
              "      <td>5754.177760</td>\n",
              "      <td>True</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>"
            ],
            "text/plain": [
              "       City name  Population  Area square miles  Population density  \\\n",
              "2     Sacramento      485199              97.92         4955.055147   \n",
              "0  San Francisco      852469              46.87        18187.945381   \n",
              "1       San Jose     1015785             176.53         5754.177760   \n",
              "\n",
              "   Is wide and has saint name  \n",
              "2                       False  \n",
              "0                       False  \n",
              "1                        True  "
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 35
        }
      ]
    },
    {
      "metadata": {
        "id": "-GQFz8NZuS06",
        "colab_type": "text"
      },
      "cell_type": "markdown",
      "source": [
        " 重建索引是一种随机排列 `DataFrame` 的绝佳方式。在下面的示例中，我们会取用类似数组的索引，然后将其传递至 NumPy 的 `random.permutation` 函数，该函数会随机排列其值的位置。如果使用此重新随机排列的数组调用 `reindex`，会导致 `DataFrame` 行以同样的方式随机排列。\n",
        "尝试多次运行以下单元格！"
      ]
    },
    {
      "metadata": {
        "id": "mF8GC0k8uYhz",
        "colab_type": "code",
        "colab": {
          "autoexec": {
            "startup": false,
            "wait_interval": 0
          },
          "base_uri": "https://localhost:8080/",
          "height": 136
        },
        "outputId": "0722ce64-deb6-42c5-b2cc-e0ab1e1ea44d",
        "executionInfo": {
          "status": "ok",
          "timestamp": 1528028109548,
          "user_tz": -480,
          "elapsed": 1322,
          "user": {
            "displayName": "陈方杰",
            "photoUrl": "https://lh3.googleusercontent.com/a/default-user=s128",
            "userId": "115418685205938795893"
          }
        }
      },
      "cell_type": "code",
      "source": [
        "cities.reindex(np.random.permutation(cities.index))"
      ],
      "execution_count": 36,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/html": [
              "<div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>City name</th>\n",
              "      <th>Population</th>\n",
              "      <th>Area square miles</th>\n",
              "      <th>Population density</th>\n",
              "      <th>Is wide and has saint name</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>0</th>\n",
              "      <td>San Francisco</td>\n",
              "      <td>852469</td>\n",
              "      <td>46.87</td>\n",
              "      <td>18187.945381</td>\n",
              "      <td>False</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>2</th>\n",
              "      <td>Sacramento</td>\n",
              "      <td>485199</td>\n",
              "      <td>97.92</td>\n",
              "      <td>4955.055147</td>\n",
              "      <td>False</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>1</th>\n",
              "      <td>San Jose</td>\n",
              "      <td>1015785</td>\n",
              "      <td>176.53</td>\n",
              "      <td>5754.177760</td>\n",
              "      <td>True</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>"
            ],
            "text/plain": [
              "       City name  Population  Area square miles  Population density  \\\n",
              "0  San Francisco      852469              46.87        18187.945381   \n",
              "2     Sacramento      485199              97.92         4955.055147   \n",
              "1       San Jose     1015785             176.53         5754.177760   \n",
              "\n",
              "   Is wide and has saint name  \n",
              "0                       False  \n",
              "2                       False  \n",
              "1                        True  "
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 36
        }
      ]
    },
    {
      "metadata": {
        "id": "fSso35fQmGKb",
        "colab_type": "text"
      },
      "cell_type": "markdown",
      "source": [
        " 有关详情，请参阅[索引文档](http://pandas.pydata.org/pandas-docs/stable/indexing.html#index-objects)。"
      ]
    },
    {
      "metadata": {
        "id": "8UngIdVhz8C0",
        "colab_type": "text"
      },
      "cell_type": "markdown",
      "source": [
        " ## 练习 2\n",
        "\n",
        "`reindex` 方法允许使用未包含在原始 `DataFrame` 索引值中的索引值。请试一下，看看如果使用此类值会发生什么！您认为允许此类值的原因是什么？"
      ]
    },
    {
      "metadata": {
        "id": "PN55GrDX0jzO",
        "colab_type": "code",
        "colab": {
          "autoexec": {
            "startup": false,
            "wait_interval": 0
          },
          "base_uri": "https://localhost:8080/",
          "height": 195
        },
        "outputId": "37b9adae-5781-401c-d4b2-63fa8069ba98",
        "executionInfo": {
          "status": "ok",
          "timestamp": 1528028246239,
          "user_tz": -480,
          "elapsed": 1188,
          "user": {
            "displayName": "陈方杰",
            "photoUrl": "https://lh3.googleusercontent.com/a/default-user=s128",
            "userId": "115418685205938795893"
          }
        }
      },
      "cell_type": "code",
      "source": [
        "# Your code here\n",
        "cities.reindex([0, 5, 1, 7, 2])"
      ],
      "execution_count": 39,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/html": [
              "<div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>City name</th>\n",
              "      <th>Population</th>\n",
              "      <th>Area square miles</th>\n",
              "      <th>Population density</th>\n",
              "      <th>Is wide and has saint name</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>0</th>\n",
              "      <td>San Francisco</td>\n",
              "      <td>852469.0</td>\n",
              "      <td>46.87</td>\n",
              "      <td>18187.945381</td>\n",
              "      <td>False</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>5</th>\n",
              "      <td>NaN</td>\n",
              "      <td>NaN</td>\n",
              "      <td>NaN</td>\n",
              "      <td>NaN</td>\n",
              "      <td>NaN</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>1</th>\n",
              "      <td>San Jose</td>\n",
              "      <td>1015785.0</td>\n",
              "      <td>176.53</td>\n",
              "      <td>5754.177760</td>\n",
              "      <td>True</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>7</th>\n",
              "      <td>NaN</td>\n",
              "      <td>NaN</td>\n",
              "      <td>NaN</td>\n",
              "      <td>NaN</td>\n",
              "      <td>NaN</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>2</th>\n",
              "      <td>Sacramento</td>\n",
              "      <td>485199.0</td>\n",
              "      <td>97.92</td>\n",
              "      <td>4955.055147</td>\n",
              "      <td>False</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>"
            ],
            "text/plain": [
              "       City name  Population  Area square miles  Population density  \\\n",
              "0  San Francisco    852469.0              46.87        18187.945381   \n",
              "5            NaN         NaN                NaN                 NaN   \n",
              "1       San Jose   1015785.0             176.53         5754.177760   \n",
              "7            NaN         NaN                NaN                 NaN   \n",
              "2     Sacramento    485199.0              97.92         4955.055147   \n",
              "\n",
              "  Is wide and has saint name  \n",
              "0                      False  \n",
              "5                        NaN  \n",
              "1                       True  \n",
              "7                        NaN  \n",
              "2                      False  "
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 39
        }
      ]
    },
    {
      "metadata": {
        "id": "TJffr5_Jwqvd",
        "colab_type": "text"
      },
      "cell_type": "markdown",
      "source": [
        " ### 解决方案\n",
        "\n",
        "点击下方，查看解决方案。"
      ]
    },
    {
      "metadata": {
        "id": "8oSvi2QWwuDH",
        "colab_type": "text"
      },
      "cell_type": "markdown",
      "source": [
        " 如果您的 `reindex` 输入数组包含原始 `DataFrame` 索引值中没有的值，`reindex` 会为此类“丢失的”索引添加新行，并在所有对应列中填充 `NaN` 值："
      ]
    },
    {
      "metadata": {
        "id": "yBdkucKCwy4x",
        "colab_type": "code",
        "colab": {
          "autoexec": {
            "startup": false,
            "wait_interval": 0
          }
        }
      },
      "cell_type": "code",
      "source": [
        "cities.reindex([0, 4, 5, 2])"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "metadata": {
        "id": "2l82PhPbwz7g",
        "colab_type": "text"
      },
      "cell_type": "markdown",
      "source": [
        " 这种行为是可取的，因为索引通常是从实际数据中提取的字符串（请参阅 [*pandas* reindex 文档](http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.reindex.html)，查看索引值是浏览器名称的示例）。\n",
        "\n",
        "在这种情况下，如果允许出现“丢失的”索引，您将可以轻松使用外部列表重建索引，因为您不必担心会将输入清理掉。"
      ]
    }
  ]
}